# Wikipedia pre-training

## Gemma 2 9b Turkish Lora Continue Pre Trained
A LoRA-adapted model based on google/gemma-2-9b, further pre-trained on Turkish Wikipedia data to enhance its Turkish text processing capabilities.
Tags: Large Language Model, Other · Author: emre
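A minimal sketch of how a LoRA-adapted checkpoint like this is typically loaded with `peft` on top of the base model. The adapter repo id below is a placeholder assumption, not the model's confirmed id; substitute the actual repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base Gemma 2 9B model and its tokenizer.
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b")

# Attach the Turkish continued-pretraining LoRA weights (placeholder repo id).
model = PeftModel.from_pretrained(base, "emre/gemma-2-9b-tr-lora")

# Generate a Turkish continuation.
inputs = tokenizer(
    "İstanbul, Türkiye'nin en kalabalık şehridir ve", return_tensors="pt"
).to(model.device)
out = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```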
## Gemma 2b It Tamil V0.1 Alpha
A Tamil instruction-tuned version of Google's Gemma 2B model, supporting bilingual text generation in English and Tamil.
Tags: Large Language Model, Transformers, Supports Multiple Languages · License: Other · Author: abhinand
## Bart Large Japanese
A Japanese BART large model pre-trained on Japanese Wikipedia, suitable for text generation and other natural language processing tasks.
Tags: Large Language Model, Transformers, Japanese · Author: ku-nlp
## Sbert Base Cased Pl
SHerbert is a Sentence-BERT implementation based on the Polish HerBERT model; it produces semantically meaningful sentence embeddings that can be compared via cosine similarity.
Tags: Text Embedding, Other · Author: Voicelab
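A minimal sketch of sentence-similarity scoring with a Sentence-BERT-style checkpoint, assuming the repo id `Voicelab/sbert-base-cased-pl` and mean pooling over token embeddings; check the model card for the pooling strategy the model was actually trained with.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

name = "Voicelab/sbert-base-cased-pl"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

sentences = ["Kot siedzi na macie.", "Na macie leży kot."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, dim)

# Mean-pool over non-padding tokens to get one vector per sentence.
mask = batch["attention_mask"].unsqueeze(-1)
emb = (hidden * mask).sum(1) / mask.sum(1)

# Cosine similarity between the two sentence embeddings.
print(F.cosine_similarity(emb[0], emb[1], dim=0).item())
```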
## Bert Base 5lang Cased
A slimmed-down version of bert-base-multilingual-cased that supports only 5 languages (English, French, Spanish, German, and Chinese); it is 30% smaller than the original while maintaining the same representation quality for these languages.
Tags: Large Language Model, Supports Multiple Languages · License: Apache-2.0 · Author: amine
## Bert Base Ja
A BERT base model trained on a Japanese Wikipedia dataset, suitable for masked language modeling on Japanese text.
Tags: Large Language Model, Transformers, Japanese · Author: colorfulscoop
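A minimal sketch of masked language modeling with the `fill-mask` pipeline, assuming the repo id `colorfulscoop/bert-base-ja`; the mask token is read from the tokenizer rather than hard-coded.

```python
from transformers import pipeline

# Fill-mask pipeline for the Japanese BERT checkpoint (assumed repo id).
fill = pipeline("fill-mask", model="colorfulscoop/bert-base-ja")

# Build a masked sentence using the model's own mask token.
masked = f"日本の首都は{fill.tokenizer.mask_token}です。"
for pred in fill(masked):
    print(pred["token_str"], round(pred["score"], 3))
```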
## Bert Italian Finedtuned Squadv1 It Alfa
An Italian BERT base model fine-tuned on an Italian version of SQuAD v1 for downstream question-answering tasks.
Tags: Question Answering System, Other · Author: mrm8488
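A minimal sketch of extractive question answering with the `question-answering` pipeline, assuming the repo id `mrm8488/bert-italian-finedtuned-squadv1-it-alfa`.

```python
from transformers import pipeline

# Extractive QA: the model selects an answer span from the context.
qa = pipeline(
    "question-answering",
    model="mrm8488/bert-italian-finedtuned-squadv1-it-alfa",  # assumed repo id
)
result = qa(
    question="Dove si trova il Colosseo?",
    context="Il Colosseo è un anfiteatro situato nel centro di Roma.",
)
print(result["answer"], result["score"])
```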
## Bert Base Thai
A Thai-specific pre-trained model based on the BERT-Base architecture, adapted to the characteristics of Thai tokenization and offering better performance than multilingual BERT on Thai text.
Tags: Large Language Model, Other · Author: monsoon-nlp
## French Albert Base Cased
A case-sensitive ALBERT base model pre-trained on French Wikipedia, suitable for French NLP tasks.
Tags: Large Language Model, Transformers, French · License: Apache-2.0 · Author: cservan
## Roberta Swedish
A Swedish pre-trained model based on the RoBERTa architecture, suitable for various natural language processing tasks.
Tags: Large Language Model · Author: birgermoell
## Roberta Base Thai Char
A RoBERTa model pre-trained on Thai Wikipedia text that uses character-level embeddings for compatibility with BertTokenizerFast.
Tags: Large Language Model, Transformers, Other · License: Apache-2.0 · Author: KoichiYasuoka
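A minimal sketch of the tokenizer pairing described above, assuming the repo id `KoichiYasuoka/roberta-base-thai-char`; `AutoTokenizer` is expected to resolve to `BertTokenizerFast` for this checkpoint.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

name = "KoichiYasuoka/roberta-base-thai-char"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

print(type(tokenizer).__name__)       # expected: BertTokenizerFast
print(tokenizer.tokenize("ภาษาไทย"))  # character-level pieces, one per Thai character
```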
## Chinese Bert Wwm Ext Upos
A BERT model pre-trained on Chinese Wikipedia texts, intended for POS tagging and dependency parsing.
Tags: Sequence Labeling, Transformers, Supports Multiple Languages · License: Apache-2.0 · Author: KoichiYasuoka
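A minimal sketch of Universal POS tagging via the `token-classification` pipeline, assuming the repo id `KoichiYasuoka/chinese-bert-wwm-ext-upos`; dependency parsing requires an additional parser on top of the tagger and is not shown here.

```python
from transformers import pipeline

# Token classification with adjacent same-label tokens grouped together.
tagger = pipeline(
    "token-classification",
    model="KoichiYasuoka/chinese-bert-wwm-ext-upos",  # assumed repo id
    aggregation_strategy="simple",
)
for tok in tagger("我喜欢读书。"):
    print(tok["word"], tok["entity_group"])
```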
## Bert Large Japanese Char Extended
A BERT model pre-trained on Japanese Wikipedia text, derived from bert-large-japanese-char, with an extended character embedding layer that covers additional kanji.
Tags: Large Language Model, Transformers, Japanese · Author: KoichiYasuoka